A Comparison of Non-informative Priors for Bayesian Networks Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150

Authors

  • Tomi Silander
  • Henry Tirri
Abstract

We consider Bayesian and information-theoretic approaches for determining non-informative prior distributions in a parametric model family. The information-theoretic approaches are based on Rissanen's recently modified definition of stochastic complexity and on Wallace's Minimum Message Length (MML) approach. The Bayesian alternatives include the uniform prior and various equivalent sample size priors. To compare the approaches empirically, the methods are instantiated for a model family of practical importance, the family of Bayesian networks. The results with several public domain datasets show that the choice of the prior distribution can have a significant effect on the results obtained, especially if the amount of data available is small. Inspired by our empirical observations, we also introduce a new heuristic for determining the prior distribution. The empirical results show that the heuristic consistently gives very good results compared with those obtained by the alternative methods.
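To illustrate why the prior matters most when data are scarce, the following is a minimal sketch that compares a uniform Dirichlet prior with an equivalent sample size (ESS) prior for a single conditional distribution in a discrete Bayesian network. The abstract does not give the exact priors used in the paper, so the ESS form here (hyperparameter `ess / (r * q)` per cell, the common BDeu-style choice) is an assumption, and the function names `uniform_prior`, `ess_prior`, and `posterior_mean` are hypothetical.

```python
def uniform_prior(r):
    """Dirichlet(1, ..., 1): the uniform prior over an r-valued distribution."""
    return [1.0] * r


def ess_prior(ess, r, q):
    """Assumed BDeu-style prior: spread `ess` pseudo-counts evenly over
    the r values of the variable and the q parent configurations."""
    return [ess / (r * q)] * r


def posterior_mean(counts, alphas):
    """Posterior mean of each multinomial parameter under a Dirichlet prior:
    (N_k + alpha_k) / (N + sum(alpha))."""
    total = sum(counts) + sum(alphas)
    return [(n + a) / total for n, a in zip(counts, alphas)]


if __name__ == "__main__":
    r, q = 2, 4  # binary variable with four parent configurations
    for counts in ([3, 0], [300, 0]):  # small vs. large sample
        print(counts,
              posterior_mean(counts, uniform_prior(r)),
              posterior_mean(counts, ess_prior(1.0, r, q)))
```

With three observations, the two priors give clearly different estimates for the observed value (0.8 under the uniform prior versus roughly 0.96 under the assumed ESS prior with `ess = 1`). With three hundred observations, both estimates are above 0.99 and the choice of prior hardly matters.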

Similar resources

Latent Semantic Kernels for Feature Selection Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150

Latent Semantic Indexing is a method for selecting informative subspaces of feature spaces. It was developed for information retrieval to reveal semantic information from document co-occurrences. The paper demonstrates how this method can be implemented implicitly in a kernel-defined feature space and hence adapted for application to any kernel-based learning algorithm and data. Experiments with...

Discrete versus Analog Computation: Aspects of Studying the Same Problem in Different Computational Models Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150

In this tutorial we outline some of the features that come up when analyzing the same computational problems in different complexity-theoretic frameworks. We focus on two problems: the first is related to mathematical optimization, and the second deals with the intrinsic structure of complexity classes. Both examples serve well for working out how far different approaches to the same pro...

Multiplicative Updatings for Support-vector Learning Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150

Support Vector machines find maximal margin hyperplanes in a high dimensional feature space. Theoretical results exist which guarantee a high generalization performance when the margin is large or when the number of support vectors is small. Multiplicative-Updating algorithms are a new tool for perceptron learning whose theoretical properties are well studied. In this work we present a Multiplica...

Dynamically Adapting Kernels in Support Vector Machines Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150

The kernel-parameter is one of the few tunable parameters in Support Vector machines, and controls the complexity of the resulting hypothesis. The choice of its value amounts to model selection, and is usually performed by means of a validation set. We present an algorithm which can automatically perform model selection and learning with no additional computational cost and with no need of a va...

Data-dependent Structural Risk Minimisation for Perceptron Decision Trees Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150

Perceptron Decision Trees (also known as Linear Machine DTs, etc.) are analysed in order that data-dependent Structural Risk Minimization can be applied. Data-dependent analysis is performed which indicates that choosing the maximal margin hyperplanes at the decision nodes will improve the generalization. The analysis uses a novel technique to bound the generalization error in terms of the marg...

Publication date: 1998